Apertium-IceNLP: A rule-based Icelandic to English machine translation system
Abstract
We describe the development of a prototype of an open source rule-based Icelandic→English MT system, based on the Apertium MT framework and IceNLP, a natural language processing toolkit for Icelandic. Our system, Apertium-IceNLP, is the first system in which the whole morphological and tagging component of Apertium is replaced by modules from an external system. Evaluation shows that the word error rate and the position-independent word error rate for our prototype are 50.6% and 40.8%, respectively. As expected, these are higher than the corresponding error rates of two publicly available MT systems that we used for comparison. Contrary to our expectations, the error rates of our prototype are also higher than those of a comparable system based solely on Apertium modules. Based on error analysis, we conclude that better translation quality may be achieved by replacing only the tagging component of Apertium with the corresponding module in IceNLP, while leaving morphological analysis to Apertium.
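For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how they are commonly computed for a single sentence pair. It is not the evaluation tooling used by the authors, which the abstract does not name. The function names `wer` and `per` are hypothetical, tokenization by whitespace is an assumption, and the PER formula shown is one common definition among several.

```python
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate: like WER, but word order is ignored.

    Assumed definition: (max(|ref|, |hyp|) - bag-of-words matches) / |ref|.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Multiset intersection counts words shared by both sentences.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)
```

Because PER ignores word order, it is never higher than WER for the same sentence pair under these definitions, which is consistent with the 40.8% versus 50.6% figures reported above.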
Similar resources
Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium
In this paper, we describe two methods developed for sharing linguistic data between two free and open source rule based machine translation systems: Apertium, a shallow-transfer system; and Grammatical Framework (GF), which performs a deeper syntactic transfer. In the first method, we describe the conversion of lexical data from Apertium to GF, while in the second one we automatically extract ...
apertium-cy - a collaboratively-developed free RBMT system for Welsh to English
apertium-cy (http://www.cymraeg.org.uk) is a rule-based “gisting” machine translation system for Welsh to English, with both engine and data released under the GPL. We summarise the development of apertium-cy, evaluate its output, and discuss the advantages of a collaborative development model combined with rule-based MT for marginalised languages. 1. The Apertium platform apertium-cy is a “gistin...
A Rule-based Shallow-transfer Machine Translation System for Scots and English
An open-source rule-based machine translation system is developed for Scots, a low-resourced minor language closely related to English and spoken in Scotland and Ireland. By concentrating on translation for assimilation (gist comprehension) from Scots to English, it is proposed that the development of dictionaries designed to be used within the Apertium platform will be sufficient to produce tr...
Automatic acquisition of Named Entities for Rule-Based Machine Translation
This paper proposes to enrich RBMT dictionaries with Named Entities (NEs) automatically acquired from Wikipedia. The method is applied to the Apertium English–Spanish system and its performance compared to that of Apertium with and without hand-tagged NEs. The system with automatic NEs outperforms the one without NEs, while results vary when compared to a system with hand-tagged NEs (results are ...
The Universitat d'Alacant hybrid machine translation system for WMT 2011
This paper describes the machine translation (MT) system developed by the Transducens Research Group, from Universitat d’Alacant, Spain, for the WMT 2011 shared translation task. We submitted a hybrid system for the Spanish–English language pair consisting of a phrase-based statistical MT system whose phrase table was enriched with bilingual phrase pairs matching transfer rules and dictionary e...
Publication date: 2011